Annual Report -- Grant DAMD17-94-J-4439


INTRODUCTION 

Transcription  factors  bind  to  target  DNA  sequences  to  regulate  metabolic 
functions  such  as  growth  and  differentiation.  The  PU.1  (spi-1,  sfpi-1) 
transcription  factor  (1)  is  a  member  of  the  ets  gene  family,  a  recently 
discovered  family  of  regulatory  proteins.  There  are  now  more  than  45 
members  in  this  family  that  have  been  identified  in  various  organisms 
from  Drosophila  to  humans  (reviewed  in  Refs.  2  and  3).  These  molecules 
play  a  role  in  normal  development  and  have  been  implicated  in  malignant 
processes.  Important  for  these  studies  is  the  fact  that  ets-related 
proteins  have  been  identified  in  normal  mammary  cell-specific  gene 
expression  (4)  as  well  as  in  breast  cancer  cell  lines  (5-7). 

The  ets-related  proteins  have  been  proposed  to  regulate  gene  expression 
in  mammary  tissue  and  such  molecules  could  influence  the  production  of 
gene  products  that  are  responsible  for  human  breast  tumorigenesis  (7).  It 
has  been  shown  that  elevated  expression  of  the  PEA3  gene  is  directly 
correlated  with  the  development  of  metastatic  mammary  tumors  in 
transgenic  mice  with  the  neu  oncogene  (5).  Moreover,  in  25-30%  of 
primary human breast cancers, there is an amplification and
overexpression of the HER2/neu (c-erbB-2) proto-oncogene (6). Overexpression
of HER2 is associated with more aggressive tumor growth and reduced
patient survival (6). An ets-related response element has been found in
the promoter of the HER2 gene and deletion analysis of this promoter
revealed that this site is an important cis-acting element for HER2
transcriptional activity (7). Thus, an ets protein present in these cells
stimulates  the  expression  of  HER2  and  may  be  a  contributing  factor  to  the 
development  of  breast  cancers.  It  has  also  been  found  that  L-plastin  is 
over-expressed in a number of solid tumors (8), while there are four ets
promoter elements on the L-plastin gene (9). These results, together with
those obtained from the study of HER2 expression, strongly implicate ets
proteins  in  the  development  and/or  metastatic  spread  of  human  breast 
tumors.  Therefore,  the  structural  models  that  are  to  be  generated  in  this 
study  will  be  an  important  contribution  to  the  future  design  of  therapeutic 
agents  to  modulate  tissue-specific  transcription  directed  by  ets 
molecules. 

The  ets  proteins  share  a  conserved  region  of  approximately  85  amino  acids 
known  as  the  ETS  domain  (10)  that  serves  as  a  DNA-binding  domain  and 
recognizes a purine-rich sequence, 5'-C/AGGAA/T-3'. Ets proteins differ
in size and in the relative position of the ETS domain within the intact
protein.  The  remaining  sequences  in  ets  proteins  are  presumed  to  form 
other  functional  domains  such  as  activation  domains  or  inhibitory  domains 
that  mask  the  DNA  binding  site.  The  ETS  domain  binds  to  DNA  as  a 
monomer,  unlike  many  other  DNA-binding  proteins. 
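
The consensus 5'-C/AGGAA/T-3' can be read as a simple pattern: C or A, then GGA, then A or T. The following is a minimal Python sketch, not part of the reported work, that scans both strands of a DNA string for this pattern. The helper names (`reverse_complement`, `find_ets_sites`) and the test sequence in the usage note are hypothetical and chosen only for illustration.

```python
import re

# Consensus from the text: 5'-(C/A)GGA(A/T)-3'.
# The pattern is wrapped in a lookahead so overlapping hits are all reported.
ETS_CONSENSUS = re.compile(r"(?=([CA]GGA[AT]))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_ets_sites(seq: str):
    """Return (strand, 0-based start, matched text) for every consensus hit.

    Positions on the '-' strand are in reverse-complement coordinates.
    """
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for match in ETS_CONSENSUS.finditer(s):
            hits.append((strand, match.start(), match.group(1)))
    return hits
```

For an invented sequence such as `TTGAGGAAGTAA`, `find_ets_sites` reports a single hit, `AGGAA`, on the top strand. A match to the consensus is only a candidate site; it says nothing about whether a given ets protein actually binds there.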

PU.1,  the  subject  of  this  study,  is  an  ets  protein  expressed  in 
hematopoietic  cells  and  specifically  in  immune  system  cells  such  as  B 
cells,  macrophages,  neutrophils  or  mast  cells  (1,2).  The  sequence  of  PU.1 
is  identical  with  the  oncogene  spi-1  (2).  Within  the  ets  family,  the  PU.1 
sequence  is  the  most  divergent  from  ets-1  and  yet  there  is  40%  sequence 
homology  in  the  DNA  binding  domain  of  these  two  proteins  (see  Figure  1). 
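
A figure such as "40% sequence homology" is usually derived from a pairwise alignment. The report does not say how the value was computed, so the sketch below makes an assumption: it counts percent identity over the aligned columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical, and the real PU.1 and ets-1 sequences are not reproduced here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-').

    Both inputs must already be aligned (equal length, gaps written as '-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared
```

Other conventions exist, for example dividing by the length of the shorter sequence or scoring conservative substitutions as similar, and they can give noticeably different percentages for the same alignment.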

At  the  end  of  the  first  year  of  funding,  we  reported  the  successful 
crystallization  of  the  PU.1  ETS  domain  in  complex  with  DNA.  In  the  second 
year  of  funding,  we  have  solved  the  crystal  structure  of  this  complex. 

This is the first ets protein to be crystallized and, to date, this is the
only  crystal  structure  available  for  an  ets  protein.  We  succeeded  by 
varying  the  length  of  both  protein  and  DNA  fragments  in  the  complex.  Last 
year,  we  published  the  methods  used  for  the  co-crystallization  of  the 
complex  (11).  Because  of  the  strong  sequence  homology  of  the  DNA- 
binding  domains,  we  propose  that  similar  strategies  may  be  useful  for 
crystallization  of  ETS  domains  from  other  members  of  the  ets  family.  In 
the  present  annual  report,  we  will  describe  the  crystal  structure  of  the 
complex  and  also  present  significant  advances  that  we  have  made  toward 
the  solution  structure  by  nuclear  magnetic  resonance  studies  of  the 
unbound  domain. 

BODY  -  PROGRESS  REPORT 

During  the  first  12  months  of  funding,  we  accomplished  the  first  two 
tasks  set  forth  in  the  statement  of  work  of  the  original  application.  The 
focus  of  our  efforts  during  months  13  to  24  was  primarily  on  Tasks  3  and 
4  and  these  efforts  will  be  outlined  in  detail  in  the  following  sections. 

Task  1.  Large  scale  purification  of  the  PU.1  DNA-binding  domain. 
Months  1-36 

Milligram  quantities  of  the  DNA-binding  domain  of  PU.1  were  expressed  in 
bacteria  and  purified  to  homogeneity  (11).  The  protocols  developed  were 
scaled up to production level and were standardized for
reproducibility. These protocols were critical for the production of large
crystals. Interestingly, the longer domain fragment that was optimal for
crystallization was not ideal for the nuclear magnetic resonance (NMR)
studies in solution. Therefore, we had to produce another protein
fragment  for  the  solution  studies  described  under  Task  3.  Despite 
implementing  different  purification  protocols  for  the  fragment  and 
verifying  that  the  fragment  was  intact  after  time  in  the  NMR  tube,  the 
resonance  spectra  were  not  complete  (see  report  under  Task  3). 
Crosspeaks  were  only  seen  for  94  of  the  111  expected  backbone  amide 
protons.  By  this  time,  from  the  crystal  structure  analyses  described 
under Task 4, we already knew that there is inherent flexibility at the
amino- and carboxyl-terminal ends of this fragment since there were 11
disordered residues at the amino-terminus and 14 disordered residues at
the carboxyl-terminus that were not visible in the electron density map.
Therefore  we  generated  a  new  fragment  specifically  for  the  NMR  studies 
that  is  shorter  and  expressed  from  a  different  vector  system  (pET3)  to 
optimize  yield  for  labelling.  The  new  fragment  is  93  residues  in  length 
and  includes  an  N-terminal  methionine  for  bacterial  expression.  This 
fragment  was  purified  to  homogeneity  and  tested  for  stability  by  gel 
electrophoresis (see Figure 2). Improved definition of the growth media
and culture conditions is now providing consistently high yields (4-6
mg/liter bacterial culture) of soluble protein. More recently, we have
produced labelled protein for the high resolution analyses described in
Task 3. The efforts to modify the production of protein fragments have
been critical for the progress that we report for the NMR studies
over the past year.


Task  2.  Synthesis  of  DNA  oligonucleotides.  Months  1-18 

DNA  oligonucleotides  were  screened  for  binding  in  complex  with  the  PU.1 
domain  and  those  that  promoted  crystallization  of  the  complex  were 
selected  for  final  screening.  Ultimately,  a  sixteen  base-pair 
oligonucleotide  with  the  sequence  5'-AAAAAGGGGAAGTGGG-3'  and  the 
complementary strand were synthesized on the ten micromole scale,
purified by reverse-phase HPLC and annealed. These
procedures  were  optimized  to  produce  fragments  that  promoted  the 
growth  of  large  crystals  of  the  complex  (11).  It  was  evident  in  the 
electron  density  map  of  the  complex  that  the  DNA  fragments  formed  long 
extended fiber-like elements in the crystal lattice by end-to-end stacking
between adjacent oligonucleotides, and that this was a major interaction
for  the  nucleation  of  crystal  growth. 
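
As a small illustration of the oligonucleotide described above, the following Python sketch derives a complementary strand for the 16-mer and locates the GGAA core. It assumes a fully base-paired duplex, which may not match the oligonucleotides actually used (the appended reprint mentions unpaired A and T bases at the termini). The helper names are hypothetical.

```python
# 16-mer from the text, written 5'->3'.
OLIGO = "AAAAAGGGGAAGTGGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def core_positions(seq: str, core: str = "GGAA"):
    """0-based start positions of every occurrence of `core` in `seq`."""
    return [i for i in range(len(seq) - len(core) + 1)
            if seq.startswith(core, i)]


# Assumption: blunt-ended, fully complementary bottom strand (5'->3').
bottom_strand = reverse_complement(OLIGO)
```

With these definitions, `core_positions(OLIGO)` places the GGAA core at 0-based positions 7 to 10, close to the middle of the 16-mer, and the bottom strand contains no second GGAA core.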


Task 3. Determination of the solution structure of the PU.1
domain by NMR. Months 1-36

With  the  shorter  fragment,  prepared  specifically  for  NMR,  we  have  already 
proceeded  directly  to  experiments  with  labelled  protein  for  the  NMR 
analysis.  Our  earlier  studies  with  the  longer  fragment  were  slowed 
because the absence of 17 expected backbone peaks makes the NMR
analyses more difficult and ambiguous. This new sample should accelerate
our progress. The HSQC (Heteronuclear Single Quantum Correlation)
spectrum for the 94-residue fragment labelled with isotopic nitrogen is
presented in Figure 3. This experiment shows exclusively resonances of
protons attached to a labelled 15N nucleus. In the spectrum each spot
represents  a  backbone  amide  proton.  This  spectrum  also  shows  protons 
from  amino  acid  side  chains  that  contain  nitrogen.  For  example,  the  three 
spots  in  the  lower  left  of  the  figure  result  from  tryptophan  side  chains 
and  the  resonances  from  glutamines  and  asparagines  are  indicated  in  the 
upper right in the figure. Here we can identify all the expected peaks: 93
observable amide protons from the main chain of the protein, 7 x 2 for the
glutamine and asparagine side chains (each side chain has two protons
attached to nitrogen), and 3 indole (NH) protons, one from each of the
three tryptophans.
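
The peak accounting above can be sketched in a few lines of Python. The tally function uses only the numbers given in the text. The sequence-based counter relies on a common rule of thumb that is an assumption here, not something stated in the report: every residue except the N-terminal one and prolines contributes one backbone amide peak, each Asn or Gln side chain contributes two, and each Trp indole contributes one. Function names and the toy sequence are hypothetical.

```python
def tally_peaks(backbone: int, n_asn_gln: int, n_trp: int) -> int:
    """Total 1H-15N peaks: backbone + 2 per Asn/Gln side chain + 1 per Trp."""
    return backbone + 2 * n_asn_gln + n_trp


def expected_hsqc_peaks(sequence: str) -> dict:
    """Estimate expected 1H-15N HSQC peaks from a one-letter protein sequence.

    Assumptions (rule of thumb, not taken from the report): the N-terminal
    residue and prolines give no backbone amide peak; other side-chain NH
    groups (e.g. Arg, Lys, His) are ignored.
    """
    seq = sequence.upper()
    backbone = sum(1 for i, aa in enumerate(seq) if i > 0 and aa != "P")
    side_chain_nh2 = 2 * (seq.count("N") + seq.count("Q"))
    trp_indole = seq.count("W")
    return {
        "backbone": backbone,
        "side_chain_nh2": side_chain_nh2,
        "trp_indole": trp_indole,
        "total": backbone + side_chain_nh2 + trp_indole,
    }


# Numbers quoted in the text: 93 backbone amides, 7 Asn/Gln, 3 Trp.
total_reported = tally_peaks(93, 7, 3)
```

With the numbers quoted in the text, the tally comes to 93 + 14 + 3 = 110 peaks.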

These results represent the unambiguous data essential for the NMR
structural studies. The full accounting of all expected amide crosspeaks
shown in Figure 3 allows us to proceed directly to further experiments that
link all the residues properly in their position and to push towards a
tertiary structure of this domain with no gaps in the backbone
assignments. Experiments to produce doubly-labelled (13C,15N) protein
are underway.

In  the  process  of  generating  spectra  from  the  short  fragment,  we  have 
also  observed  in  the  initial  homonuclear  spectra  (i.e.,  protons  only)  that 
one  of  the  tryptophan  side  chains  may  populate  more  than  one 
conformation. In the crystal structure, two tryptophans are adjacent in
the β-sheet (strand β1). The third tryptophan is located in helix α2 and
contacts the DNA backbone in the complex. When glycine is substituted
for  this  tryptophan,  DNA  binding  is  lost  (12).  As  an  example  of  how  the 
NMR  and  crystal  structure  analyses  are  complementary,  we  are  testing  the 
effect  of  temperature  and  concentration  on  the  "moving"  tryptophan. 
Chemical  shift  analyses  suggest  that  this  residue  corresponds  to  Trp215 
that  contacts  the  DNA.  We  plan  to  monitor  the  effects  of  motion  and 
flexibility of this tryptophan by NMR dynamics. The time frame for this
particular motion and the conformational adjustment upon titration with
DNA are important issues in evaluating the role of this aromatic side chain
in DNA recognition. These results also have to be confirmed in the
15N/13C-labelled sample for proper analysis.


Task 4. Determination of the crystal structure of the PU.1
domain  complexed  to  DNA.  Months  1-36 

Production  of  large  crystals  of  the  PU.1  ETS  domain  in  complex  with  a  16 
base-pair  synthetic  oligonucleotide  containing  the  recognition  sequence 
was  achieved  and  reported  in  last  year's  report  and  published  (11;  reprint 
included in Appendix). Crystals formed in the space group C2 with a=89.1,
b=101.9, c=55.6 Å and β=111.2°, with two complexes in the asymmetric
unit.  Four  heavy  atom  derivatives  were  prepared  by  soaking  crystals  and 
by  co-crystallizing  with  iodinated  oligonucleotides.  The  locations  of  the 
iodinated  bases  were  also  used  to  orient  the  DNA  in  the  electron  density 
map.  Multiple  isomorphous  replacement  phases  plus  anomalous  scattering 
were used to calculate initial electron density maps at 3 Å resolution.
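
The unit cell parameters quoted above fix the cell volume. For a monoclinic cell the volume is V = a * b * c * sin(β). The Python sketch below evaluates this for the reported cell and divides by the number of complexes per cell. It relies on the standard fact that space group C2 has four asymmetric units per unit cell; the variable names are hypothetical, and no solvent content is estimated because that would need a molecular weight not given here.

```python
import math


def monoclinic_cell_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Volume of a monoclinic unit cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


# Cell parameters from the text (lengths in angstroms, beta in degrees).
volume = monoclinic_cell_volume(89.1, 101.9, 55.6, 111.2)

ASU_PER_CELL_C2 = 4        # general positions in space group C2
COMPLEXES_PER_ASU = 2      # stated in the text
complexes_per_cell = ASU_PER_CELL_C2 * COMPLEXES_PER_ASU
volume_per_complex = volume / complexes_per_cell  # cubic angstroms
```

The resulting volume per complex includes both the protein-DNA complex and the surrounding solvent.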

The  initial  MIRAS  map  was  improved  by  solvent  flattening  by  the  method 
of  Wang  (13).  The  improved  electron  density  map  was  used  to  build  the 
model.  The  density  for  the  DNA  helix  was  a  prominent  feature  of  the  map. 
After  the  DNA  was  positioned,  the  polypeptide  backbone  was  fitted  with 
polyalanine  and  finally  side  chains  were  added  to  the  model.  There  were 
11  disordered  residues  at  the  amino-terminus  and  14  disordered  residues 
at  the  C-terminus  so  these  amino  acids  were  not  included  in  the  model. 

For  all  other  residues  representing  the  complete  ETS  domain,  the  electron 
density  was  clear  and  allowed  unambiguous  fitting  of  backbone  and  side 
chain  atoms.  Stereochemistry  was  optimized  to  ideal  bond  and  angle 
parameters  in  X-PLOR  (14).  One  cycle  of  simulated  annealing  was 
followed  by  alternate  cycles  of  manual  model  building  and  refinement. 

The structure was reported (12; reprint included in Appendix) at 2.3 Å
resolution and the high quality of the model was demonstrated with a
crystallographic R-factor of 23.7% and an Rfree of 29.9%.
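
For reference, the conventional crystallographic R-factor is the sum of | |Fobs| - |Fcalc| | over reflections divided by the sum of |Fobs|. Rfree is the same statistic evaluated only on a set of reflections held out of refinement. The sketch below is a generic Python illustration, not the X-PLOR procedure used in this work; it assumes the calculated amplitudes are already on the same scale as the observed ones, and the function name is hypothetical.

```python
def r_factor(f_obs, f_calc) -> float:
    """Conventional R = sum(| |Fo| - |Fc| |) / sum(|Fo|).

    Assumes f_calc is already scaled to f_obs. Pass only working-set
    reflections for R, or only the held-out test set for Rfree.
    """
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator
```

Multiplying the returned fraction by 100 gives the percentage form quoted in the text.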

This  is  the  first  crystal  structure  determined  for  an  ets  protein.  The  PU.1 
domain assumes a tight globular structure (33 x 34 x 38 Å³) formed by
three α-helices and a four-stranded antiparallel β-sheet (see Figure 4).
The  domain  topology  is  similar  to  the  structures  of  other  ets  family 
proteins  fli-1  (15),  murine  ets-1  (16)  and  human  ets-1  (17)  determined  in 
solution by NMR. The structural studies revealed that ETS domains have a
common folding pattern that is similar to α+β helix-turn-helix (HTH) DNA-
binding proteins including CAP (18) and resembles 'winged' HTH proteins
such as HNF-3γ (19). The domain contacts DNA from three sites: the
recognition helix (α3), the loop between β-strands 3 and 4 (a 'wing') and
the turn in the HTH motif (α2-turn-α3). This turn is longer than the
equivalent in many other HTH proteins and is actually a loop. As shown in
Figure 4, the PU.1 domain, and probably other members of the ets family,
use a DNA-binding motif that can more appropriately be called a loop-
helix-loop motif. Therefore our crystal structure reveals a new pattern
for HTH recognition and a novel mode of DNA-binding.

Four  strictly  conserved  residues  on  the  surface  of  the  domain  are  likely  to 
be  important  for  DNA-binding  by  all  members  of  the  ets  family.  Arg232 
and  Arg235  emanate  from  the  recognition  helix  and  contact  the  conserved 
GGAA  sequence  in  the  major  groove  of  DNA.  Lys245  from  the  'wing' 
contacts  the  phosphate  backbone  of  the  GGAA  strand  in  the  minor  groove 
upstream  from  the  core  sequence  (see  Figure  5)  and  Lys219  in  the  loop  of 
the  HTH  motif  forms  a  salt  bridge  with  the  phosphate  backbone  of  the 
opposite  strand  downstream  of  the  GGAA  core.  Substitutions  of  glycine  at 
each  of  these  four  conserved  sites  abolished  DNA  binding,  confirming  the 
functional  importance  of  these  residues.  Thus  these  interactions 
represent  the  paradigm  for  ets  recognition  which  is  expected  to  be 
reproduced  in  all  ets  proteins. 

Water  molecules  also  participate  in  protein-DNA  recognition  in  the  PU.1 
complex.  There  are  27  well-defined  solvent  molecules  around  the  DNA.  In 
the  major  groove,  some  of  these  solvent  molecules  are  hydrogen-bonded  to 
the  bases  and  also  form  a  hydrogen-bonded  water  network  between  the 
two  strands  that  might  contribute  to  the  stability  of  the  duplex  and 
influence  DNA  recognition.  The  two  conserved  arginines  make  water- 
mediated  contacts  with  the  DNA. 

The  DNA  is  bent  in  the  complex  (8°)  when  compared  to  'canonical'  B-DNA 
structure  and  is  curved  uniformly  along  the  entire  16  bp  length.  The  minor 
groove is slightly enlarged (~2-3 Å from the mean) in the GGAA region at
the  midpoint  of  the  oligonucleotide.  Surprisingly,  the  protein-DNA 
interactions  reported  in  the  NMR  structure  of  a  human  ets-1-DNA  complex 
(17)  differed  dramatically  from  this  pattern,  involving  different  contacts 
and  significant  (60°  kink)  DNA  deformation.  The  molecular  basis  for  the 
kinked  DNA  cannot  be  understood  in  the  context  of  the  contacts  seen  in  the 
PU.1  complex.  In  the  ets-1  structure,  this  kink  is  induced  by 
intercalation of the side chain of a tryptophan between bases 6 and 7. The
equivalent of this tryptophan, Tyr175 in PU.1, is located in the
hydrophobic  core,  excluding  the  possibility  for  intercalation  with  the  DNA 
bases.  Substitution  of  glycine  for  this  tyrosine  did  not  affect  DNA  binding 
in  PU.1.  Furthermore,  the  site  of  intercalation  is  located  at  the  opposite 
extreme  of  the  DNA  duplex,  upstream  of  the  GGAA  core  sequence;  that  is, 
the  ets-1  protein  is  docked  on  the  DNA  180°  from  the  position  of  the  PU.1 
domain.  Because  of  this  difference  in  orientation,  in  the  ets-1  complex, 
the  conserved  arginines  do  not  contact  the  GGAA  bases.  The  striking 
distinction  between  the  PU.1  and  ets-1  structures  could  reflect  extreme 
evolutionary  divergence  between  members  of  the  ets  family  although  this 
is highly unlikely. Alternatively, it should be noted that the ets-1
complex  was  formed  under  denaturing  conditions  and  it  is  possible  that 
the  Trp  intercalation  occurred  early  in  the  renaturation  step  with 
subsequent  protein  refolding. 

Currently,  the  PU.1  contacts  are  being  tested  with  mutagenesis  to 
understand  these  interactions  with  respect  to  biological  function  and 
other  residues  are  also  being  substituted  to  identify  residues  that 
mediate  recognition  of  a  specific  DNA  sequence  by  a  given  family  member. 


Conclusions 

The  work  accomplished  during  the  past  budget  period  has  been  a 
significant  contribution  to  our  understanding  of  the  way  that  ets  proteins 
recognize DNA. We have successfully produced the first crystal structure
of  an  ets  protein  and  the  model  will  serve  as  the  basis  to  begin  to 
describe  the  atomic  detail  of  protein-DNA  contacts  for  this  family. 

During  the  next  budget  period,  the  model  will  be  refined  and  more  solvent 
atoms  will  be  added.  Also,  we  will  extend  the  resolution  by  incorporating 
high  resolution  data  collected  at  a  synchrotron  source.  We  will  proceed 
with  the  NMR  analysis  of  the  unbound  domain.  The  sequential  assignment 
and  assignments  of  the  secondary  and  tertiary  structure  contacts  will  be 
made.  In  complementary  NMR  studies,  we  will  emphasize  dynamics 
measurements  to  assess  conformational  adjustments  that  may  occur  on 
DNA  binding  by  titration  of  the  PU.1  solution.  Besides  residues  that 
contact  DNA,  all  other  structural  elements  (helices,  loops,  etc.)  will  be 
monitored  by  dynamic  methods  since  proteins  and  protein  complexes  have 
intrinsic fluctuations that may influence biological activity. In this
manner,  the  comparison  of  the  structure  of  the  domain  complexed  with 
DNA  in  the  crystal  and  the  domain  alone  in  solution  by  NMR  will  provide 
valuable  information  on  the  process  of  DNA  recognition  by  ETS  domains. 
The atomic models will be used to suggest sites of mutation to learn more
about the specific mode of DNA binding by individual family members. The
results of mutagenesis on PU.1 as well as other ets proteins will be
correlated with the model, highlighting base contacts, phosphate backbone
contacts and water-mediated contacts.


APPENDIX

Five  figures  and  two  reprints  appended 


Figure 1

The PU.1 transcription factor is a member of the ets
gene family of regulatory proteins. These molecules play
a role in normal development and also have been implicated
in malignant processes such as the development of
erythroid leukemia. The Ets proteins share a conserved
DNA-binding domain (the ETS domain) that recognizes
a purine-rich sequence with the core sequence:
5'-C/AGGAA/T-3'. This domain binds to DNA as a monomer,
unlike many other DNA-binding proteins. The ETS domain
of the PU.1 transcription factor has been crystallized
in complex with a 16-base pair oligonucleotide that
contains the recognition sequence. The crystals formed
in the space group C2 with a = 89.1, b = 101.9, c = 55.6 Å,
and β = 111.2° and diffract to at least 2.3 Å. There are two
complexes in the asymmetric unit. Production of large
usable crystals was dependent on the length of both
protein and DNA components, the use of oligonucleotides
with unpaired A and T bases at the termini, and the
presence of polyethylene glycol and zinc acetate in the
crystallization solutions. This is the first ETS domain to
be crystallized, and the strategy used to crystallize this
complex may be useful for other members of the ets
family.


Transcription  factors  bind  to  target  DNA  sequences  and  reg¬ 
ulate  important  metabolic  functions  such  as  cell  growth,  devel¬ 
opment,  and  differentiation.  The  PU.l  (spi-1,  sfpi-l)  transcrip¬ 
tion  factor  (1)  is  a  member  of  the  ets  gene  family,  a  recently 
discovered  family  of  regulatory  proteins.  There  are  now  more 
than  35  members  in  this  family  that  have  been  identified  in 
various  organisms  from  Drosophila  to  humans  (reviewed  in 
Refs.  2  and  3).  These  molecules  play  a  role  in  normal  develop¬ 
ment  and  have  been  implicated  in  malignant  processes  such  as 
erythroid  leukemia  and  Ewing’s  sarcoma  (4).  The  Ets  proteins 
share  a  conserved  region  of  approximately  85  amino  acids 
known  as  the  ETS  domain  (5)  that  serves  as  a  DNA-binding 
domain  and  recognizes  a  purine-rich  sequence  with  the  core 
sequence,  5'-C/AGGAA/T-3'. 
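
The core sequence 5'-C/AGGAA/T-3' reads as (C or A)-G-G-A-(A or T). As a minimal sketch of what that notation means in practice, the following Python snippet scans both strands of a DNA string for the core. The function names and the test sequence are illustrative assumptions and are not taken from this study.

```python
import re

# Core ETS recognition motif from the text: 5'-(C/A)GGA(A/T)-3'.
# A lookahead is used so that overlapping matches are also reported.
ETS_CORE = re.compile(r"(?=([CA]GGA[AT]))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_ets_cores(seq: str):
    """List (strand, start, match) for each core motif on either strand.

    Positions are 0-based. For the "-" strand they are counted along the
    reverse-complemented sequence, not along the input string.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in ETS_CORE.finditer(s):
            hits.append((strand, m.start(1), m.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence, chosen only to contain one core motif.
    print(find_ets_cores("TTTGAGGAAGTGC"))
```

Scanning the reverse complement matters because the protein may recognize the motif on either strand of a duplex.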

Ets  proteins  differ  in  size  and  in  the  relative  position  of  the 
ETS domain. For example, the domain is found near the carboxyl-terminal
end of the molecule in PU.1 (Ref. 1; see Fig. 1)
and the ets-1 and ets-2 proteins (6, 7), in the middle of the
sequence in Erg (8), and within the amino-terminal region in
Elk-1 (9). The remaining sequences in Ets proteins are presumed
to form other functional domains such as activation
domains  or  inhibitory  domains  that  mask  the  DNA  binding  site 
(10,  11).^  The  ETS  domain  is  sufficient  for  DNA  binding  and 
binds  to  DNA  as  a  monomer,  unlike  many  other  DNA-binding 
proteins. 

Recently,  the  folding  pattern  of  the  DNA-binding  domain  of 
Fli-1,  an  ets  family  protein,  was  described  by  NMR  analysis 
(12). The domain consists of 3 α-helices and a four-stranded
antiparallel β-sheet. Features of this secondary structure (13)
as  well  as  that  of  the  murine  ets-1  domain  (14)  are  very  similar 
to  the  winged  helix-turn-helix  motif  in  DNA-binding  proteins 
such  as  CAP  (15)  and  HNF-3  (16).  In  order  to  define  precisely 
the  protein-DNA  contacts,  we  co-crystallized  the  ETS  domain 
of the PU.1 transcription factor in complex with cognate DNA.

The PU.1 transcription factor is expressed in hematopoietic
cells and specifically in B cells, macrophages, neutrophils, and
mast cells (1, 2). The sequence of PU.1 is identical with the
oncogene Spi-1 (17). Spi-1 is activated in the erythroid leukemia
induced by spleen focus forming virus. Integration of
spleen focus forming virus upstream of the Spi-1/PU.1 gene
results in overexpression of the Spi-1/PU.1 protein. This event
is associated with the development of erythroid leukemia. The
PU.1 molecule has been shown to interact with other nuclear
proteins. For example, PU.1 binds to the 3' enhancer sequence
of the Ig-κ gene in complex with a second factor NF-EM5 (PIP)
(18, 19). Formation of the ternary complex of PU.1, NF-EM5,
and DNA is dependent on PU.1 binding to the core GGAA
sequence and phosphorylation of serine 148 in PU.1 (18). The
sites  of  protein-protein  interaction  and  phosphorylation  are 
immediately  adjacent  and  amino-terminal  to  the  DNA-binding 
domain. 

There  are  several  subfamilies  of  Ets  proteins  that  appear  to 
have  arisen  by  gene  duplication  of  a  primordial  gene  (3).  The 
amino acid sequence of PU.1 is the most divergent from ets-1,
yet there is 40% sequence homology in the DNA-binding domains
of these proteins. Fourteen residues are strictly conserved
in the DNA-binding domain when all ETS domains are
compared. Here we report a strategy to clone and express a
recombinant fragment encompassing the ETS domain of PU.1
for  structural  studies.  Successful  co-crystallization  with  DNA 
was  dependent  on  the  length  of  the  protein  fragment  and  also 
on  the  length  of  the  synthetic  oligonucleotide  bound  to  the 
fragment.  It  has  been  shown  in  studies  of  other  DNA-binding 
proteins  (reviewed  in  Refs.  20-22)  that  alteration  of  the  length 
of DNA oligonucleotides is important to optimize crystallization
of the protein-DNA complex. Recently, an extensive analysis
of conditions to produce crystals of the U1A-RNA complex
was  reported  (23).  In  that  study,  varying  the  length  of  RNA 


^ M. Klemsz and R. A. Maki, unpublished results.
Crystals of PU.1 ETS Domain-DNA Complex


hairpins  as  well  as  utilization  of  mutant  proteins  was  neces¬ 
sary  to  produce  high  quality  crystals.  The  results  of  the  screen¬ 
ing  of  both  protein  and  RNA  components  were  used  to  propose 
a  general  strategy  for  crystallization  of  protein-RNA  com¬ 
plexes.  Since  this  is  the  first  ETS  domain  to  be  crystallized,  the 
details  of  the  selection  and  production  of  the  protein  and  DNA 
components  of  the  complex  will  be  described  here.  Because  of 
the  strong  sequence  homology  of  the  DNA-binding  domains, 
similar  strategies  may  be  useful  for  successful  crystallization  of 
ETS  domains  from  other  members  of  the  ets  family. 

MATERIALS  AND  METHODS 

Cloning and Expression of the PU.1 DNA-binding Domain -- The
DNA-binding domain of PU.1 was cloned in the pET11 expression
vector (24) by polymerase chain reaction amplification of the DNA-binding
domain from the full-length mouse PU.1 cDNA as described
previously (1). DNA sequence analysis was used to verify that the
sequence of the amplified product was identical with the original clone.
For bacterial expression, pET plasmid constructs were used to transform
Escherichia coli BL21(DE3)pLysS cells. A preculture of 50 ml of
LB medium (25) and ampicillin (100 µg/ml) was inoculated with a single
colony from freshly transformed BL21(DE3)pLysS cells bearing the
DNA-binding domain insert. After an overnight incubation at 37 °C,
this preculture was used to inoculate 7.5 liters of LB-ampicillin medium.
Cells were grown overnight at 26 °C in an aerated fermentor (Microferm,
New Brunswick, NJ). The next morning, 2.5 liters of LB-ampicillin
buffered at pH 7.4 with sodium phosphate were added to the culture.
Expression of protein was induced at 26 °C with the addition of 1 mM
isopropyl-1-thio-β-D-galactopyranoside. After 4 h, cells were harvested
by centrifugation and stored as a paste at -70 °C.

Purification of PU.1 DNA-binding Domain -- Cell pellets from 1 liter
of culture were resuspended in 200 ml of lysis buffer (20 mM Tris-HCl,
pH 7.5, 200 mM NaCl, 2 mM EDTA, and 0.1 mM phenylmethylsulfonyl
fluoride). Cells were lysed on ice by sonication, cell debris was cleared
by centrifugation at 17,000 rpm and 4 °C for 60 min, and the concentration
of sodium chloride in the supernatant was adjusted to 1 M.
Polyethyleneimine was added to a final concentration of 0.2% and
precipitation proceeded with gentle mixing for 30 min on ice. The
precipitate was removed by centrifugation at 15,000 rpm and 4 °C for 30
min. The supernatant solution was dialyzed at pH 7.5 against 20 mM
Tris-HCl, 60 mM NaCl, and 0.1 mM phenylmethylsulfonyl fluoride and
then centrifuged again before application to CM-Sepharose Fast-Flow
resin. The ETS domain was isolated by ion exchange chromatography
at 4 °C with a linear NaCl gradient (60 mM to 1.2 M). Fractions containing
the DNA-binding domain were pooled and concentrated by ultrafiltration.
The domain was purified to homogeneity by gel filtration on a
Sephacryl S-100 (Pharmacia) molecular sizing matrix at pH 7.4 in
phosphate-buffered saline and 0.02% sodium azide. Purified protein
was concentrated to 0.5 mM, quick-frozen, and stored in aliquots
at -70 °C.

Purification of Selenomethionine-substituted Protein -- In order to
produce modified protein for structure solution by multiwavelength
anomalous dispersion phasing methods (26), recombinant PU.1 DNA-binding
domain was produced with selenomethionine substituted for
methionine. Bacterial cells (E. coli strain B834; Novagen, Inc.) which
are auxotrophic for methionine (BL21DE3met-) were used to express
the DNA-binding domain. Competent B834 cells were freshly transformed
with the pET11 vector containing the domain. For expression of
the modified protein, a preculture of 50 ml of LB-ampicillin medium
was inoculated with a single colony and incubated at 37 °C. After 16 h,
5 ml of this preculture was used to inoculate 1 liter of M9 medium (25)
containing 100 µg/ml ampicillin supplemented with 50 µg/ml selenomethionine
(Sigma) and 2 mg/liter each of biotin and thiamine. Cells were
grown  at  room  temperature  until  the  absorbance  at  reached  0.15, 
and expression of recombinant protein was induced by the addition of 1
mM isopropyl-1-thio-β-D-galactopyranoside. After 16 h, cells were harvested
by centrifugation and stored at -70 °C. The selenomethionine-substituted
protein was purified by procedures described for the native
domain.  The  extent  of  selenomethionine  substitution  was  evaluated  by 
amino  acid  analysis  and  mass  spectrometry. 

DNA Synthesis and Purification -- DNA oligonucleotides of various
lengths were synthesized on a 10-µmol scale using phosphoramidite
chemistry with an Applied Biosystems Model 394 DNA/RNA synthesizer.
Derivatized oligonucleotides were synthesized by substituting
iodinated uracil phosphoramidites (Glen Research Laboratories) for
thymine phosphoramidites. After the last cycle, the oligonucleotides
were cleaved from the solid support, and protecting groups on exocyclic
amines were removed by treatment with ammonium hydroxide according
to manufacturer's protocols before lyophilization. Oligonucleotides
were purified by reverse phase HPLC^ on a Vydac C4 column at 56 °C
using an acetonitrile gradient in 100 mM triethylammonium bicarbonate
buffer (pH 8.5). Fractions containing the full-length oligonucleotide
were pooled and acetonitrile was removed by dialysis against triethylammonium
bicarbonate buffer. The oligonucleotides were desalted in
20% ethanol on Bio-Gel P2 resin (Bio-Rad Laboratories, Inc.), lyophilized
twice, and stored in aliquots at -70 °C.

Before  co-crystallization,  DNA  extinction  coefficients  were  calculated 
for  each  oligonucleotide  strand  (27),  and  complementary  strands  were 
mixed  in  equimolar  ratios  in  5  mM  Mes,  200  mM  NaCl,  pH  7.0,  to  a  final 
concentration  of  0.5  mM.  Strands  were  annealed  by  heating  the  mixture 
to 95 °C and slowly cooling over a few hours to 20 °C.
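
The equimolar mixing step rests on the Beer-Lambert relation, c = A / (ε · l). The sketch below shows one way the arithmetic could be done, assuming each strand's extinction coefficient at 260 nm is already known. The function names and all numerical values in the example are hypothetical, and the calculation ignores the contribution of stock buffer components to the final mixture.

```python
def strand_concentration_mM(a260: float, epsilon_per_M_cm: float,
                            path_cm: float = 1.0, dilution: float = 1.0) -> float:
    """Beer-Lambert: c = A / (epsilon * l), corrected for dilution, in mM."""
    return a260 / (epsilon_per_M_cm * path_cm) * dilution * 1e3


def equimolar_volumes_ul(conc_a_mM: float, conc_b_mM: float,
                         final_mM: float, final_volume_ul: float):
    """Volumes (strand A, strand B, buffer) giving each strand at final_mM."""
    vol_a = final_mM * final_volume_ul / conc_a_mM
    vol_b = final_mM * final_volume_ul / conc_b_mM
    buffer_ul = final_volume_ul - vol_a - vol_b
    if buffer_ul < 0:
        raise ValueError("stocks are too dilute for the requested final concentration")
    return vol_a, vol_b, buffer_ul


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.8 at a 200-fold dilution,
    # extinction coefficient 160,000 per M per cm.
    stock = strand_concentration_mM(0.8, 160_000, dilution=200)
    print(f"strand stock: {stock:.2f} mM")
    # Hypothetical stocks of 2.0 and 1.0 mM mixed to 0.5 mM each in 100 µl.
    print(equimolar_volumes_ul(2.0, 1.0, 0.5, 100.0))
```

With both strands at 0.5 mM before annealing, the duplex concentration after annealing is also 0.5 mM, provided pairing is complete.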

Space Group Determination and X-ray Data Collection -- Crystals
were characterized for diffraction using a Rigaku RU-200 rotating anode
x-ray source with a graphite monochromator operating at 50 kV and
100 mA, two San Diego Multiwire Systems area detectors, and the
UCSD data processing programs (28). Initial characterization and
space group determination were performed at room temperature; however,
the crystals were sensitive to x-ray exposure. Therefore, all crystals
used for this study were cryoprotected in solutions of polyethylene
glycol (PEG) and methylpentanediol and immediately frozen in a nylon
loop in a cooled nitrogen stream. X-ray data were collected at -145 °C
using a cryocooling device and a liquid nitrogen-cooled gas stream
(Molecular Structures, Inc.).

RESULTS  AND  DISCUSSION 

Screening of Protein Fragments -- Two different recombinant
proteins  were  generated  that  each  include  the  minimal  DNA- 
binding  domain.  These  fragments  are  shown  in  Fig.  1.  The  two 
fragments  differ  in  length  at  both  the  amino-  and  carboxyl- 
terminal  ends  of  the  sequence.  The  amino-terminal  sequence 
and  amino  acid  composition  of  these  fragments  indicated  that 
the  purified  proteins  lacked  the  amino-terminal  methionine, 
probably  as  a  result  of  proteolytic  cleavage  by  methionyl 
aminopeptidase  (29). 

We  first  generated  a  protein  of  93  amino  acids  corresponding 
to  residues  168  to  260  since  this  region  encompassed  the  min¬ 
imal  DNA-binding  domain  identified  by  deletion  analysis  (1). 
After  expression  and  purification,  when  this  fragment  was 
tested by dynamic light scattering, the protein solution was
monodisperse  (results  not  shown)  which  was  a  preliminary 
indication  that  the  recombinant  molecule  was  suitable  for  crys¬ 
tallization  trials  (30).  However,  when  the  protein  was  concen¬ 
trated  beyond  5  mg/ml,  the  fragment  formed  aggregates  and 
insoluble precipitates. Moreover, this fragment was susceptible
to  proteolytic  degradation  upon  prolonged  storage.  These  ob¬ 
servations  suggested  that  the  fragment  was  not  folded  correctly 
and  that  the  molecule  was  not  a  good  candidate  for  crystalli¬ 
zation.  After  extensive  screening,  no  crystals  were  obtained 
with  this  fragment  alone.  Only  small  crystals  were  observed  for 
this  fragment  in  complex  with  DNA,  and  these  crystals  were 
difficult  to  reproduce. 

In  order  to  generate  a  fragment  with  improved  solubility 
properties,  a  strategy  to  alter  the  length  of  the  molecule  was 
implemented.  The  design  of  a  construct  to  produce  the  longer 
fragment  shown  in  Fig.  1  was  based  on  secondary  structure 
predictions  and  an  alignment  of  multiple  ETS  domain  se¬ 
quences.  This  analysis  indicated  that  the  predicted  secondary 
structure  of  the  sequence  at  the  amino-terminal  boundary  of 
the  short  fragment  was  not  consistent  for  members  of  the  ets 
family. For PU.1, this region was predicted to form an α-helix,
while in the majority of other ets family sequences, β-strands
were  predicted.  Therefore,  the  amino-terminal  sequence  of  the 
new  construct  was  extended  to  the  boundary  of  the  PEST 


^ The abbreviations used are: HPLC, high performance liquid chromatography;
PEG, polyethylene glycol; Mes, 2-(N-morpholino)ethanesulfonic
acid; bp, base pair(s).
Fig. 1. Schematic representation of the PU.1 protein. The sequence of the full-length protein encompasses the activation domain, PEST
region and the ETS domain which is located at the carboxyl-end of the molecule (reviewed in Ref. 2). The site of phosphorylation (S148) that
influences protein-protein interactions is labeled (18). Below the molecule, the amino acid sequences for the termini of the two recombinant
fragments tested for crystallization are listed. The shorter segment extending from residues 168 to 260 was cloned first; however, this fragment
was not a stable protein for structural studies. The longer segment corresponded to residues 160 to 272, which is the actual carboxyl terminus of
the full-length PU.1 molecule. This protein was extremely soluble and monodisperse in solution. The amino-terminal serine of this fragment results
from the cloning strategy and is not part of the wild-type sequence.


domain  excluding  a  region  at  the  end  of  the  PEST  region  that 
is  a  conserved  hydrophilic  sequence  (see  Fig.  1).  At  the  carboxyl 
terminus, the sequence was extended to the end of the full-length
PU.1 molecule. The long fragment encoded by this construct
corresponded to residues 160 to 272. After expression
and  purification,  this  fragment  was  remarkably  soluble  up  to 
concentrations  of  60  mg/ml  and  remained  monodisperse  in 
solution  even  at  these  high  concentrations  and  after  prolonged 
storage  at  -70  °C.  Despite  the  optimal  physical  properties  of 
this fragment, it is surprising that the molecule never crystallized
alone even with extensive screening using incomplete
factorial  (31)  and  sparse  matrix  (32)  crystallization  trials. 

Co-crystallization with DNA Oligonucleotides -- Some DNA-binding
proteins crystallize only when complexed to specific
cognate  oligonucleotides  (reviewed  in  Refs.  21-22).  In  many  of 
the  complexes  crystallized  to  date,  the  ends  of  the  DNA  frag¬ 
ments  interacted  in  the  crystal  lattice  to  form  an  extended, 
distorted  DNA  helix  with  base-paired  interactions  between 
adjacent  DNAs  in  the  crystal  lattice.  In  this  respect,  the  oligo¬ 
nucleotides  direct  the  orientation  of  the  complex  in  the  crystal. 
The PU.1 DNA-binding domain recognizes a purine-rich sequence
having a core sequence of 5'-GGAA-3'. The sequences of
the  oligonucleotides  used  in  this  study  were  identified  by 
screening  random  sequence  oligonucleotides.^  A  number  of  ol¬ 
igonucleotides  were  chemically  synthesized  each  of  which  in¬ 
cluded  the  PU  box  sequence  and  differed  in  length.  As  shown  in 
Fig.  2,  oligonucleotides  with  termini  that  provide  blunt-ended 
or  overhanging  bases  were  tested  for  co-crystallization. 

The  quality  of  the  oligonucleotides  was  critical  for  successful 
co-crystallization.  In  particular,  care  was  taken  to  achieve 
>95%  homogeneous  oligonucleotide  by  reverse-phase  HPLC. 
The chromatographic separations were run at 56 °C to avoid
the  formation  of  secondary  structure  during  purification.  Full- 
length  oligonucleotides  were  eluted  from  the  C4  column  with 
an  acetonitrile-triethylammonium  bicarbonate  gradient.  Puri¬ 
fication  using  other  gradients  or  performed  on  ion  exchange 
resins  did  not  produce  oligonucleotides  that  were  adequate  for 
crystallization.  After  extensive  dialysis  to  remove  acetonitrile, 
each  purified  oligonucleotide  was  concentrated  by  successive 
lyophilizations  from  dilute  ammonium  bicarbonate  and  was 
finally  desalted  in  20%  ethanol  with  a  Bio-Gel  P2  column. 
Complete  desalting  was  critical  for  the  formation  of  large  crys¬ 
tals.  In  fact,  DNA  heterogeneity  or  contaminating  ions  were 
factors  that  inhibited  crystal  growth  or  produced  showers  of 


poorly  formed  crystals.* 

Prior  to  mixing  with  protein,  duplex  DNA  was  annealed  by 
heating to 95 °C and cooling slowly to 20 °C. Molar extinction
coefficients  were  calculated  for  each  strand  (22)  to  ensure  that 
the  strands  to  be  annealed  were  present  in  equimolar  concen¬ 
trations.  Duplex  DNA  molecules  shown  in  Fig.  2  were  mixed 
with freshly thawed PU.1 protein in molar ratios of 2:1 or 1:1
DNA:protein.  In  each  case,  complex  formation  was  verified 
using  a  gel  shift  electrophoretic  assay  (results  not  shown).  DNA 
binding  was  tested  with  both  of  the  protein  fragments.  Solubil¬ 
ity  testing  and  precipitation  analyses  were  also  performed  with 
selected  complexes  before  crystallization  trials.  The  solubility 
of  the  protein-DNA  complexes  was  diminished  relative  to  the 
proteins alone, particularly as compared to the longer PU.1
fragment.  In  fact,  some  of  the  complexes  precipitated  immedi¬ 
ately  upon  mixing.  These  precipitates  could  be  redissolved  by 
the  addition  of  NaCl  or  could  be  prevented  if  NaCl  was  present 
in  the  protein  solution  prior  to  the  addition  of  DNA.  Optimal 
conditions for mixing PU.1 with DNA were carefully defined
yet  were  dependent  on  the  presence  of  NaCl  at  concentrations 
that  varied  for  each  complex. 

PU.1-DNA complexes were formed with each of the oligonucleotides
shown in Fig. 2 and each of the two PU.1 fragments.
Using  UV  absorbance  measurements  at  278  nm  for  protein 
components  and  at  260  nm  for  DNA  samples,  the  final  concen¬ 
tration  of  the  complex  was  estimated  at  0.2  mM  to  0.4  mM. 
These  complexes  were  screened  for  crystallization  using  the 
sparse  matrix  method  (32),  starting  with  oligonucleotides  >20 
bp  in  length.  Trials  were  set  up  using  vapor  diffusion  and 
hanging  drops.  In  these  initial  screens,  crystals  grew  from 
conditions that are typical for protein-DNA complexes, i.e. neutral
pH, polyethylene glycol (PEG), and divalent cations (33).

For  complexes  with  the  short  protein  fragment,  only  small 
crystals  were  obtained  in  most  of  the  trials.  In  one  case,  some¬ 
what  larger  crystals  were  observed  when  the  protein  was  com- 
plexed  to  a  20-bp  blunt-ended  oligonucleotide,  but  these  crys¬ 
tals  could  not  be  improved  by  complementary  screening  with 
shorter  oligonucleotides  or  DNAs  with  overhanging  bases.  In 
contrast,  complexes  formed  with  the  longer  protein  fragment 
were  more  amenable  to  screening.  The  best  crystals  for  this 
complex  initially  formed  with  a  23-bp  oligonucleotide  with  an 
AT  overhang  (see  Fig.  2).  Crystals  of  this  complex  were  ob¬ 
served  in  several  drops  of  the  screen.  The  similarity  of  condi¬ 
tions  in  each  of  these  trials  suggested  that  sodium  acetate  was 
Fig. 2. Oligonucleotides tested in co-crystallization trials.
Each of the oligonucleotides listed was synthesized for co-crystallization
with the PU.1 domain. The sequences differ in length and termini
flanking a core sequence shown in the box at the top of the figure. The
core sequence contains the GGAA recognition sequence for PU.1 (bold).
In each oligonucleotide, the lines represent the repetition of this same
core sequence. The oligonucleotides were designed to provide both
blunt-ended duplex DNA fragments and fragments that have unpaired
T or A bases at the termini. The latter were tested because they have
the potential for end-to-end stacking in the crystal lattice. The best
success with the production of sizable crystals was achieved with two
oligonucleotides with a 5'-AT overhang (marked with asterisks). The
shorter of the two fragments, i.e. 16 bp in length, was used to produce
diffraction-quality crystals. Other oligonucleotides with unpaired termini
were designed to permit Hoogsteen base-pairing between DNA
fragments within the crystal lattice. Although the PU.1 DNA-binding
domain bound these DNA fragments, crystals were never obtained for
complexes formed with these oligonucleotides.

essential  for  crystallization.  Tests  altering  the  pH  and  acetate 
concentration produced larger crystals of the complex (0.2 ×
0.1 × 0.05 mm) after 2 months.

In  order  to  improve  these  crystals,  shorter  oligonucleotides 
were  designed.  Those  with  the  AT  overhang  were  given  priority 
in  the  screening.  When  the  long  protein  fragment  was  com- 
plexed  with  a  16-bp  oligonucleotide  with  an  AT  overhang,  crys¬ 
tals  formed  readily  as  expected;  however,  under  the  conditions 
described  above,  only  crystals  with  an  irregular  morphology 
were  obtained.  With  further  screening,  well-shaped  crystals 
were  produced  in  drops  that  contained  PEG  and  zinc  acetate.  It 
is  interesting  that  a  number  of  the  helix-turn-helix  proteins 
have  been  crystallized  from  PEG  solutions  containing  acetate 
ions.  For  example,  the  heat  shock  factor  was  crystallized  from 
PEG  4000  and  ammonium  acetate  (34),  HNF-3  transcription 
factor from potassium acetate (without PEG; Ref. 16), NF-κB
p50-DNA complex from sodium acetate and PEG 8000 (36),
paired  homeodomain  from  ammonium  acetate  and  PEG  1000 
(37),  and  even-skipped  homeodomain  from  potassium  acetate 
and  PEG  8000  (38).  It  appears  from  this  summary  that  it  is  a 
good  strategy  to  test  the  acetate  ion  in  trials  to  crystallize 
helix-turn-helix proteins. Since the presence of zinc acetate
produced significant improvement of the PU.1-DNA complex, it
is  possible  that  both  ions  will  represent  favorable  conditions  for 
crystallizing  ETS  domains.  Evaluation  of  the  general  utility  of 
these  ions  awaits  the  crystallization  of  other  ETS  domains. 

To our knowledge, this is the first report of a helix-turn-helix
protein-DNA  complex  crystallized  in  the  presence  of  zinc  ace¬ 
tate.  In  other  families  of  DNA-binding  proteins,  such  as  zinc- 
finger  proteins  (39)  or  the  diphtheria  toxin  repressor  (40),  zinc 
ions  were  necessary  for  crystallization  because  these  molecules 
have  discrete  binding  sites  for  the  zinc  ions  in  coordination 
with  residues  such  as  histidines  or  cysteines.  In  the  case  of  ETS 
domains,  it  is  possible  that  the  zinc  ions  also  stabilize  the 
protein  structure,  but  identification  of  the  sites  for  zinc  binding 
awaits  the  elucidation  of  the  crystal  structure. 

The PU.1-DNA complex crystals diffracted to 3.5 Å and were
improved  further  by  altering  the  concentration  and  molecular 
weight  of  the  PEG  used  as  precipitant.  Lower  PEG  concentra¬ 
tions  reduced  twinning  and  excess  nucleation.  A  dramatic  im¬ 
provement  in  crystal  morphology  was  achieved  by  substituting 
PEG 600 for PEG 8000. For the production of large crystals, 5
µl of complex were mixed on a siliconized coverslip with 5 µl of
a  reservoir  solution  containing  100  mM  sodium  cacodylate,  pH 
6.5,  3-10%  PEG  600,  and  200  mM  zinc  acetate.  After  mixing, 
the  coverslips  were  inverted  and  sealed  over  the  reservoir. 
Parallelepiped  crystals  formed  at  19  °C  in  3  to  5  days.  In  some 
cases,  macroseeding  (41)  was  used  to  produce  large  crystals. 
Crystals  were  washed  free  of  mother  liquor,  dissolved,  and 
subjected  to  nondenaturing  gel  electrophoresis  to  confirm  the 
presence  of  complex. 
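
Preparing a reservoir solution of this kind is a dilution calculation, V_stock = V_final · C_final / C_stock for each component. The following sketch illustrates it. The stock concentrations (1 M sodium cacodylate, 50% PEG 600, 1 M zinc acetate) and the function name are assumptions made for the example and are not stated in this report. The 5% PEG 600 target is one arbitrary point in the 3-10% range given above.

```python
def reservoir_recipe(final_volume_ul: float, targets: dict, stocks: dict) -> dict:
    """Return the volume of each stock, plus water, for one reservoir.

    targets and stocks map component name -> concentration. Each component
    must use the same unit in both dictionaries (mM with mM, % with %).
    """
    volumes = {
        name: final_volume_ul * targets[name] / stocks[name]
        for name in targets
    }
    water = final_volume_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested recipe")
    volumes["water"] = water
    return volumes


if __name__ == "__main__":
    # Target concentrations taken from the text; stocks are hypothetical.
    targets = {"sodium cacodylate pH 6.5 (mM)": 100,
               "PEG 600 (%)": 5,
               "zinc acetate (mM)": 200}
    stocks = {"sodium cacodylate pH 6.5 (mM)": 1000,
              "PEG 600 (%)": 50,
              "zinc acetate (mM)": 1000}
    for name, vol in reservoir_recipe(1000.0, targets, stocks).items():
        print(f"{name}: {vol:.0f} µl")
```

Because the drop is a 1:1 mixture of complex and reservoir solution, each reservoir component starts at half its reservoir concentration in the drop and rises as vapor diffusion proceeds.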

Diffraction Analyses -- These crystals were strongly birefringent
and diffracted to at least 2.3-Å resolution. However, the
crystals  began  to  dissolve  and  crack  when  stored  for  more  than 
1-2  weeks  and  were  very  sensitive  in  the  x-ray  beam.  It  is 
interesting  that  this  instability  is  frequently  reported  for  pro¬ 
tein-DNA  complex  crystals  (21).  Therefore,  crystals  were  flash- 
frozen  before  diffraction  experiments  in  cryoprotectant  solu¬ 
tions  of  8%  PEG  600  and  30%  methylpentanediol.  A  single 
crystal  was  quickly  transferred  from  the  crystallization  drop  to 
the  cryoprotectant  solution,  then  picked  up  in  a  loop  and  im¬ 
mediately  frozen  with  a  cooled  nitrogen  stream.  After  freezing, 
the  crystals  were  extremely  stable  in  the  x-ray  beam  at 
-145  °C  with  no  significant  decay  after  2.5  days  of  data  collec¬ 
tion.  Flash-freezing  did  not  alter  the  space  group  nor  signifi¬ 
cantly  change  the  cell  dimensions  of  the  crystals. 

The crystals of the PU.1-DNA complex belong to the space
group C2 with a = 89.1, b = 101.9, c = 55.6 Å, and β = 111.2°.
Assuming a molecular mass for the complex of 22,800 daltons,
calculations of the cell dimensions were consistent with a V_M (42)
of 2.58 Å³/dalton, solvent content of 48%, and two complexes in
the asymmetric unit. These calculations were confirmed by
experimental measurements of the crystal density (43). A native
data set (98% complete) has been collected at -145 °C to
2.3 Å resolution. The data collection statistics are presented in
Table I. The diffraction pattern displayed strong reflections
near 3.5 Å that result from scattering of B-DNA, which indicated
that the DNA oligonucleotides lie approximately along
the b axis.
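
The packing density of 2.58 Å³/dalton can be checked from the reported cell. For a monoclinic cell the volume is V = a · b · c · sin β, and the packing density is V divided by the total macromolecular mass in the cell. The sketch below assumes the standard four asymmetric units per cell for space group C2. The function names are illustrative.

```python
import math


def monoclinic_cell_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Unit cell volume in cubic angstroms for a monoclinic cell."""
    return a * b * c * math.sin(math.radians(beta_deg))


def packing_density(cell_volume_A3: float, mass_da: float,
                    n_per_asu: int, n_asu_per_cell: int) -> float:
    """Cell volume per dalton of macromolecule (cubic angstroms per dalton)."""
    return cell_volume_A3 / (mass_da * n_per_asu * n_asu_per_cell)


if __name__ == "__main__":
    # Cell dimensions and complex mass as reported in the text.
    volume = monoclinic_cell_volume(89.1, 101.9, 55.6, 111.2)
    # C2 has 4 asymmetric units per cell; 2 complexes per asymmetric unit.
    vm = packing_density(volume, 22_800, n_per_asu=2, n_asu_per_cell=4)
    print(f"cell volume = {volume:.0f} cubic angstroms, packing density = {vm:.2f}")
```

Repeating the calculation with one or three complexes per asymmetric unit gives roughly 5.2 or 1.7 Å³/dalton, which is why two complexes is the most plausible packing.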

Heavy Atom Searches -- Two approaches are being used to
obtain  heavy  atom  substitutions  for  phase  calculation.  The  first 
approach  is  to  covalently  modify  the  protein  and/or  DNA  com¬ 
ponents  of  the  complex  prior  to  crystallization  and  the  second  is 
to  soak  complex  crystals  in  solutions  containing  heavy  metal 
compounds. In the first strategy, the long PU.1 domain was
prepared  as  a  selenomethionine-substituted  protein  by  expres¬ 
sion  of  the  recombinant  molecule  in  bacterial  culture  with 
Table I

Summary of data collection statistics

Minimum          Average     Average   Number of      Number of     R_sym(a)
resolution (Å)   intensity   I/σ(I)    observations   reflections

3.93             2,898       48.3      17,522          4,063        0.040
3.12             2,287       36.5      19,299          4,103        0.053
2.73               690       12.1       9,339          4,042        0.079
2.48               405        7.2       7,256          3,969        0.099
2.30               289        4.9       6,679          3,928        0.130
Totals           1,327       22.0      60,095         20,105        0.050

(a) R_sym = Σ|I - ⟨I⟩| / ΣI, where I is the intensity of an individual measurement and ⟨I⟩ is the mean value of its equivalent reflections.
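
As an illustration of the R_sym statistic in Table I, the sketch below computes it for a toy list of measurements using the common definition R_sym = Σ|I - ⟨I⟩| / ΣI, summed over reflections measured more than once. It assumes each measurement has already been reduced to its unique symmetry-equivalent index. The function name and the toy intensities are hypothetical and are not data from this study.

```python
from collections import defaultdict


def r_sym(observations) -> float:
    """R_sym = sum |I - <I>| / sum I over multiply measured reflections.

    observations: iterable of (hkl, intensity) pairs, where hkl is the
    unique (symmetry-reduced) index of the reflection.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        if len(intensities) < 2:
            continue  # a single measurement carries no agreement information
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0:
        raise ValueError("no multiply measured reflections")
    return numerator / denominator


if __name__ == "__main__":
    # Toy data: two reflections measured repeatedly, one measured once.
    toy = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
           ((0, 2, 0), 50.0), ((0, 2, 0), 50.0), ((0, 2, 0), 56.0),
           ((0, 0, 1), 30.0)]
    print(f"R_sym = {r_sym(toy):.3f}")
```

The totals row of Table I also gives the average multiplicity of the data set: 60,095 observations over 20,105 unique reflections is about 3 measurements per reflection.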


selenomethionine as the sole source of methionine. There are 3
methionines in the long PU.1 fragment, and substitution of the
3  residues  by  selenomethionine  was  confirmed  by  amino  acid 
analysis (data not shown). The extent of substitution was 70-86%
complete in different cultures. The modified protein was
co-crystallized  in  complex  with  DNA.  Large  diffraction-quality 
crystals  of  this  complex  were  produced  that  are  isomorphous 
with  the  native  crystals. 
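
When substitution is assessed by mass spectrometry, the expected mass increase follows from replacing sulfur with selenium at each methionine. The sketch below uses standard average atomic masses (Se 78.971, S 32.06), which are not values from this report. The function names are illustrative, and the calculation treats partial substitution as a population average.

```python
# Average atomic masses in daltons (standard values, not from this study).
SELENIUM_DA = 78.971
SULFUR_DA = 32.06
SE_MINUS_S_DA = SELENIUM_DA - SULFUR_DA  # about 46.9 Da per methionine


def semet_mass_shift(n_met: int, fraction_substituted: float = 1.0) -> float:
    """Average mass increase (Da) for n_met methionines at a given substitution level."""
    return n_met * fraction_substituted * SE_MINUS_S_DA


def substitution_from_shift(observed_shift_da: float, n_met: int) -> float:
    """Estimate the substituted fraction from an observed average mass shift."""
    return observed_shift_da / (n_met * SE_MINUS_S_DA)


if __name__ == "__main__":
    # Three methionines in the long PU.1 fragment, as stated in the text.
    for fraction in (0.70, 0.86, 1.0):
        print(f"{fraction:.0%} substituted: +{semet_mass_shift(3, fraction):.1f} Da")
```

For the three methionines of the long fragment, full substitution corresponds to an increase of roughly 141 Da.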

In  order  to  modify  the  DNA  for  heavy  atom  substitution, 
halogenated  bases  (i.e.  iodine-substituted  uridine  for  thymine) 
are  suitable  for  multiple  isomorphous  replacement  methods 
(e.g.  Ref.  35).  Several  iodinated  oligonucleotides  were  synthe¬ 
sized  chemically  and  crystallized  in  complex  with  the  DNA- 
binding  domain.  Iodinated  oligonucleotides  were  tested  for 
binding to the PU.1 molecule by gel shift analyses before co-crystallization.
Large isomorphous crystals were obtained with
several  of  these  modified  oligonucleotides.  Besides  serving  as 
sites  for  heavy  atom  substitution,  the  iodines  may  also  serve  as 
markers  to  orient  the  DNA  in  the  crystal  lattice.  Since  the  axis 
of  the  DNA  is  known  from  the  strong  reflections  in  the  diffrac¬ 
tion  pattern,  the  positions  of  the  iodines  at  different  sites  on 
different  oligonucleotides  should  define  the  direction  of  the 
DNA  in  the  first  electron  density  maps. 

Finally,  crystals  of  the  native  complex  are  being  soaked  in 
heavy  atom  compounds  to  produce  substitutions  for  multiple 
isomorphous  replacement  phase  calculations.  Diffraction  data 
for  complexes  with  modified  protein  and/or  DNA  are  being 
collected  using  flash-frozen  crystals  and  ultra-low  temperature 
data  collection. 

Summary -- The production of large diffraction quality crystals
of the PU.1 ETS domain in complex with DNA was
achieved  by  a  strategy  that  combined  varying  the  length  of  both 
the  protein  and  DNA  components  of  the  complex.  The  DNA 
fragments  used  in  this  study  were  critical  to  the  successful 
crystallization  for  several  reasons.  Apparently,  end-to-end 
stacking  of  the  oligonucleotides  is  needed  for  nucleation  of 
crystal  growth  since  the  majority  of  crystals  obtained  were 
from  complexes  with  overhanging  bases.  Furthermore,  the 
length  of  the  oligonucleotide  was  important  since  complexes 
containing  longer  oligonucleotides,  especially  those  in  the 
range  of  20-23  bp,  did  not  diffract  strongly,  probably  as  a 
result  of  spacious  unoccupied  volumes  in  the  crystal  lattice.  It 
is  interesting  that  the  optimal  length  for  the  DNA  was  16  bp 
which  corresponds  to  the  length  of  DNA  protected  from  nucle¬ 
ase  cleavage  in  footprint  analyses  (1). 

While the shorter DNA oligonucleotides were best for
crystallization, the longer protein fragment exhibited the ideal
physical properties for solubility, DNA binding, and complex
crystallization. It is possible that there is an ideal ratio of size
of protein to length of DNA for successful crystallization. This
ratio relates directly to the shape of the protein component,
rather than the oligonucleotide, because the overall shape of
the B-DNA is regular and cylindrical. In cases where end-to-end
stacking occurs in the crystal, the DNA forms elongated
“fiber-like” features arranged side-by-side in the lattice. Since
the protein component is usually globular, packing of the bound
protein within the lattice formed by neighboring DNA
oligonucleotides is important for growth of a three-dimensional
crystal. With the parameters reported here and homology-based
sequence alignments, it may be possible to design similar
protein and DNA fragments to crystallize other ETS domains.

The Ets family of transcription factors, of which there are now
about 35 members, regulate gene expression during growth and
development. They share a conserved domain of around 85 amino
acids which binds as a monomer to the DNA sequence
5'-C/AGGAA/T-3'. We have determined the crystal structure of an ETS
domain complexed with DNA, at 2.3-Å resolution. The domain is
similar to α + β (winged) ‘helix-turn-helix’ proteins and
interacts with a ten-base-pair region of duplex DNA which takes up a
uniform curve of 8°. The domain contacts the DNA by a novel
loop-helix-loop architecture. Four of the amino acids that
directly interact with the DNA are highly conserved: two arginines
from the recognition helix lying in the major groove, one lysine
from the ‘wing’ that binds upstream of the core GGAA sequence,
and another lysine, from the ‘turn’ of the ‘helix-turn-helix’
motif, which binds downstream and on the opposite strand.
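
The consensus quoted above, 5'-C/AGGAA/T-3', can be read as a short pattern: C or A, then GGA, then A or T. The following is a minimal illustrative sketch, not part of the original study, of how such a core site might be located on either strand of a sequence. The function names and the test sequence are hypothetical.

```python
import re

# Core Ets consensus from the text: (C or A) G G A (A or T).
# A lookahead is used so that overlapping matches are also reported.
CORE = re.compile(r"(?=([CA]GGA[AT]))")

# Translation table for complementing a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_ets_sites(seq):
    """List (strand, start, match) for every core site on both strands.

    For the '-' strand, start is a 0-based index into the
    reverse-complemented sequence, not into the input string.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for m in CORE.finditer(s):
            hits.append((strand, m.start(1), m.group(1)))
    return hits


# Hypothetical 10-mer, used only to exercise the function.
print(find_ets_sites("TTAGGAAGTC"))
```

A sequence match of this kind only flags candidate sites. Whether a given family member binds there depends on flanking bases and on the protein contacts described in the text.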

The PU.1 (Spi-1, Sfpi-1) transcription factor is an Ets protein
expressed in haematopoietic cells. PU.1 is a regulatory protein
for differentiation of monocytes and macrophages and for B-cell
maturation (reviewed in ref. 2). The ETS domain of PU.1 was
co-crystallized with a 16 base-pair oligonucleotide containing the
recognition sequence. The structure was solved by the multiple
isomorphous replacement and anomalous scattering (MIRAS)
method (Table 1). The electron density was clearly defined (Fig. 1)
for residues 171 to 258, which encompass the entire conserved
ETS domain. The PU.1 domain assumes a tight globular structure
(33 × 34 × 38 Å) formed by three α-helices and a four-stranded
antiparallel β-sheet (Fig. 1). The domain topology is similar to the
structures of other Ets family proteins, Fli-1 (ref. 7), murine Ets-1
(ref. 8) and human Ets-1 (ref. 9), determined in solution by NMR.
The structural studies revealed a common folding pattern for ETS
domains that is similar to α + β helix-turn-helix (HTH)
DNA-binding proteins including CAP, and resembles ‘winged’ HTH
proteins such as GH5 (ref. 11), HNF-3γ (ref. 12) and HSF (ref. 13).
There are three sites of protein-DNA contact: the recognition
helix (α3), the loop between β-strands 3 and 4 (a ‘wing’) and the
turn in the HTH motif (α2-turn-α3). The turn between α2 and α3
is longer than the equivalent in many other HTH proteins, and is
actually a loop. The DNA-binding motif in PU.1, and probably
other members of the Ets family, can be described more
appropriately as a loop-helix-loop motif. Therefore the large Ets
family defines a new variant subclass of the helix-turn-helix
DNA-binding proteins with a novel mode of DNA recognition.

The protein-DNA contacts in the PU.1 complex are detailed in
Fig. 2. Four strictly conserved residues on the surface of the
domain are likely to be important for DNA binding by all
members of the Ets family. Arg232 and Arg235 emanate from
helix α3 and contact bases in the GGAA sequence in the major
groove. These contacts represent the core structure for DNA
recognition by members of the Ets family because they involve
both strictly conserved amino acids and bases in the consensus
sequence recognized by these transcription factors (see Fig. 3b).
The equivalent arginines 81 and 84 in Ets-1 (ref. 9) do not contact
the GGAA bases, but intermolecular nuclear Overhauser effects
between these arginines and DNA were observed in the Fli-1
NMR studies (ref. 7). Lys245 extends from β3 just adjacent to the
loop (‘wing’), and Lys219 is located in the ‘loop’ of the HTH motif.
Lys245 contacts the phosphate backbone of the GGAA strand in
the minor groove upstream from the core sequence (Fig. 3c) and
Lys219 forms a salt bridge with the phosphate backbone of the
opposite strand downstream of the GGAA core (Fig. 3d).
Substitutions of glycine at each of these four conserved sites abolished
DNA binding, confirming the functional importance of these
contacts (see Fig. 2).

Mutations of conserved residues that contact the phosphate
backbone also affect DNA binding. Substitution of glycine at
Leu174 or Trp215 abolished DNA binding in PU.1. Similarly,
substitution of any amino acid in Ets-1 (ref. 14) at the equivalent
of PU.1 residues Lys219 and Arg222 that bind the phosphate
backbone disrupted DNA binding. These minor-groove contacts
might represent a conserved pattern for protein ‘docking’ in the
Ets family. In Fli-1 (ref. 7), the equivalents of Leu 174, Lys 219 and
Lys 222 showed large chemical shifts on DNA binding in the NMR
studies (the counterpart of Trp 215 was buried).

Water molecules also participate in protein-DNA recognition
in the PU.1 complex (Fig. 2). There are 27 well-ordered solvent
molecules around the DNA. Solvent molecules in the major
groove are hydrogen-bonded to the bases and also form a
hydrogen-bonded network between the two strands that might
contribute to the stability of the duplex and consequently influence
specific DNA recognition. Conserved Arg232 and Arg235 each
form direct and water-mediated contacts with the bases. Three
other residues also contact DNA bases through water molecules:
Thr226, Gln228 and Asn236. These residues are not conserved in
the Ets family and might represent interactions that are unique to
the PU.1 protein. Thr226 and Gln228, at the amino-terminal end
of helix α3, make water-mediated contacts with bases C25 and C26
respectively, which are base-paired to guanines 8 and 9 in the core
sequence.
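
A water-mediated contact of the kind described above can be thought of as a solvent atom lying within a distance cutoff of both a protein atom and a DNA atom. The sketch below illustrates that test with the 3.2 Å cutoff quoted in the Fig. 2 legend. It is an illustration under stated assumptions only: the function name, atom labels and coordinates are invented for the example and are not taken from the refined model.

```python
import math

CUTOFF = 3.2  # Å, the solvent-contact distance quoted in the Fig. 2 legend


def bridging_waters(waters, protein_atoms, dna_atoms, cutoff=CUTOFF):
    """Return waters that lie within `cutoff` of both protein and DNA.

    Each argument is a list of (label, (x, y, z)) tuples in Å.
    The result lists, for each bridging water, the labels of the
    nearby protein atoms and the nearby DNA atoms.
    """
    bridges = []
    for w_label, w_xyz in waters:
        near_protein = [lab for lab, xyz in protein_atoms
                        if math.dist(w_xyz, xyz) <= cutoff]
        near_dna = [lab for lab, xyz in dna_atoms
                    if math.dist(w_xyz, xyz) <= cutoff]
        if near_protein and near_dna:
            bridges.append((w_label, near_protein, near_dna))
    return bridges


# Made-up coordinates: W1 sits between one protein atom and one base
# atom, W2 is far from both.
waters = [("W1", (0.0, 0.0, 0.0)), ("W2", (10.0, 10.0, 10.0))]
protein = [("protein atom A", (2.8, 0.0, 0.0))]
dna = [("base atom B", (0.0, 3.0, 0.0))]
print(bridging_waters(waters, protein, dna))
```

A pure distance criterion is only a first filter. Assigning a hydrogen bond in a real structure would also consider donor and acceptor chemistry and geometry.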


TABLE 1

Structure determination and refinement

                                  Native    Hg        I(29)     I(13)     I(31)
Phasing statistics
  Resolution (Å)                  2.3       3.0       2.9       3.0       2.8
  Observed reflections            60,095    25,081    20,709    20,512    23,308
  Unique reflections              20,105    14,902    13,258    12,910    15,397
  Completeness (%)                97        79        65        69        68
  Rsym (%)*                       5.0       3.6       4.0       4.3       3.6
  Riso (%) to 3.0 ņ                        13.0      14.4      15.9      13.0
  Number of sites                           2         2         2         2
For isomorphous data (I/σ ≥ 3)
  Phasing power‡                            1.33      1.76      1.04      0.98
  To resolution (Å)                         3.0       3.0       3.0       3.0
  RCullis§                                  0.62      0.57      0.68      0.67
For anomalous data (I/σ ≥ 3)
  Phasing power‡                            1.0       1.41      1.13      1.43
  To resolution (Å)                         3.0       3.0       3.0       3.0
Mean figure of merit (10-3.0 Å) is 0.65.

Refinement statistics
  Resolution range                8-2.3 Å
  Average B (Å²)                  20.1
  Crystallographic R-factor (%)   23.7
  Rfree (%)                       29.9
  Number of reflections used      16,898 (F > 3σ(F))
  Number of protein atoms         1,486
  Number of DNA atoms             1,300
  Number of solvent atoms         88

The crystallization of the PU.1 ETS domain (residues 160-272) with a 16-bp synthetic DNA oligonucleotide containing the recognition sequence was
described previously. Crystals formed in the space group C2 with a = 89.1, b = 101.9, c = 55.6 Å and β = 111.2°, with two complexes in the asymmetric
unit. Phase determination. Four heavy-atom derivatives were prepared by soaking crystals of the native complex and by co-crystallizing iodinated
oligonucleotides with the PU.1 domain. The locations of the iodinated bases are indicated in Fig. 2. Multiple isomorphous replacement phases, including
anomalous data, were calculated. The package PHASES was used to refine heavy-atom positions and B-factor/occupancies and to calculate phases to 3.0-Å
resolution with an overall figure of merit of 0.65. The initial MIRAS map (3.0 Å) was improved by solvent flattening by the method of Wang and with
non-crystallographic density averaging. Model building and refinement. The improved MIRAS electron-density map was used to build the model with the
interactive graphics programs TOM (based on FRODO) and O. The density for the DNA helix was a prominent feature of the map. To fit the DNA, an ‘ideal’
B-DNA duplex was generated with the program QUANTA (Molecular Simulations, Inc.) and fitted to the density as a rigid body. After the DNA was positioned, a
polyalanine chain was constructed with the BONES option of the Alberta/Caltech program TOM. Subsequently, side chains for all residues with clear electron
density were added to the model. There were 11 disordered residues at the N terminus of the domain and 14 disordered residues at the C terminus, so these
amino acids were not included in the model. For all other residues, representing the complete ETS domain, the electron density was clear (see Fig. 1) and
allowed unambiguous fitting of both backbone and side-chain atoms. Manual adjustments of individual DNA bases were made to fit the electron density. In
the program X-PLOR, the stereochemistry of the protein was optimized to bond and angle parameters developed by Engh and Huber, and that of the DNA by using
the parameters of Parkinson et al. Weak restraints were placed on all ribose conformations. One cycle of simulated annealing at 3,000 K (ref. 24) was followed
by cycles of manual model building, positional refinement and B-factor refinement. More data were added in increments as the refinement progressed: 3,
2.8, 2.6 and 2.3 Å. A total of 88 solvent oxygens (⟨B⟩ = 22 Å²) have been added to the model at this stage of the refinement. Main-chain torsion angles for all
non-glycine residues fall within energetically favourable Ramachandran boundaries. The r.m.s. difference for 84 α-carbon atoms in the two complexes in the
asymmetric unit is 0.35 Å.

*$R_{\mathrm{sym}}$ is $\sum |I - \langle I \rangle| / \sum \langle I \rangle$.

†$R_{\mathrm{iso}}$ is $\sum \big| |F_{PH}| - |F_{P}| \big| / \sum |F_{P}|$, where $|F_{P}|$ and $|F_{PH}|$ are structure factors for the protein and derivative, respectively.

‡Phasing power is the r.m.s. value of $|F_{H}| / E$, where $E$ is residual lack of closure.

§$R_{\mathrm{Cullis}} = \sum \big| |F_{PH} \pm F_{P}| - |F_{H,\mathrm{calc}}| \big| / \sum |F_{PH} - F_{P}|$ for centric reflections, where $F_{H,\mathrm{calc}}$ is the calculated heavy-atom structure factor.

FIG. 1 Overall structure of the PU.1-DNA complex. a, Stereoview of the refined 2.3-Å (1.5σ)
2|Fo| − |Fc| electron-density map at the protein-DNA interface. Interactions of protein (gold), DNA
(red) and water (white) in the major groove at the GGAA core recognition sequence are shown. The two
strictly conserved residues, Arg232 and Arg235 from recognition helix α3, make direct contact with
bases in the major groove. There is a tight network of water molecules at this site in the major groove (Fig.
2). b, Ribbon drawing of the PU.1 ETS domain. The module (residues 171-258) is composed of three
α-helices and a four-stranded antiparallel β-sheet. In the interior of the domain, a hydrophobic core is
formed with 19 side chains including seven strictly and eight highly conserved residues. The major
structural features that contact the DNA are indicated: the recognition helix α3 (h), the turn in the HTH
motif (t) and the loop between β-strands 3 and 4 (w) corresponding to the ‘wing’ in these proteins. At the
N-terminal end of the fragment, helix α1 begins at residue 172. The C-terminal segment, which is
disordered in the PU.1-DNA complex, assumes an α-helical conformation in the unbound Ets-1 NMR
structure. This segment might unfold in PU.1 with DNA binding. c, Space-filling model of the PU.1 ETS
domain-DNA complex. Protein-DNA interactions include both major and minor groove contacts over a
distance of 30 Å. The PU.1 transcription factor (gold) binds to DNA as a monomer, so it is not surprising
that extensive DNA contact sites exist in addition to the recognition sequence to stabilize binding. HNF-3γ
(ref. 12) and GH5 (ref. 11) also bind to target DNA as monomers. In the HNF-3γ-DNA complex, three
regions were involved in DNA recognition: the recognition helix and two ‘wings’. The location of the first
‘wing’ between the last two strands in the β-sheet corresponds topologically to the ‘wing’ in PU.1, but
contacts from the second ‘wing’ emanate from a loop at the C terminus of the domain. The structural
equivalent to the second ‘wing’ is absent in PU.1.
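
The merging and isomorphous residuals reported in Table 1 follow the standard definitions given in its footnotes: Rsym sums the deviation of each repeated intensity measurement from the mean for its reflection, and Riso sums the difference in structure-factor amplitude between derivative and native. The sketch below shows how these two quantities might be computed. It is an illustration, not the processing software used in the study, and the function names, data layout and numbers are assumptions made for the example.

```python
def r_sym(measurements):
    """Rsym = sum |I - <I>| / sum <I>, over all observations.

    `measurements` maps a unique reflection index (h, k, l) to the
    list of repeated intensity observations for that reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for observations in measurements.values():
        mean_i = sum(observations) / len(observations)
        numerator += sum(abs(i - mean_i) for i in observations)
        # Summing <I> once per observation equals n * <I>.
        denominator += mean_i * len(observations)
    return numerator / denominator


def r_iso(f_native, f_derivative):
    """Riso = sum ||F_PH| - |F_P|| / sum |F_P| over common reflections.

    Both arguments map (h, k, l) to a structure-factor amplitude.
    """
    common = f_native.keys() & f_derivative.keys()
    numerator = sum(abs(f_derivative[hkl] - f_native[hkl]) for hkl in common)
    denominator = sum(f_native[hkl] for hkl in common)
    return numerator / denominator


# Toy data, invented for the example.
intensities = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
native = {(1, 0, 0): 10.0, (0, 1, 0): 20.0}
derivative = {(1, 0, 0): 11.0, (0, 1, 0): 18.0, (0, 0, 1): 5.0}
print(f"Rsym = {100 * r_sym(intensities):.1f}%")
print(f"Riso = {100 * r_iso(native, derivative):.1f}%")
```

Both functions return fractions, so the values are multiplied by 100 when compared with the percentages listed in Table 1. In practice the derivative data would first be scaled to the native data before Riso is evaluated.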


FIG. 3 PU.1-DNA complex. a, The 16-bp oligonucleotide
bound in complex to the PU.1 ETS domain is shown in grey,
with the GGAA sequence coloured red. The ETS domain is
represented by an orange ribbon model with the side chains
for four conserved residues that contact DNA shown. When
glycine was introduced at each of these sites, DNA binding
was lost (Fig. 2). b, Detailed close-up view, showing that
Arg232 and Arg235 from the recognition helix make
hydrogen bonds with the bases GGA of the PU core sequence.
Arg235(NH2) forms a hydrogen bond with G8(O6), whereas
Arg232(NH1) makes hydrogen bonds with two bases,
G9(O6) and A10(N6). These arginines are strictly conserved
in all members of the Ets family, and the GGA sequence is the
consensus DNA sequence recognized by the Ets proteins.
Therefore the interactions shown here represent the
paradigm for Ets recognition, which is expected to be reproduced
in all Ets protein-DNA complexes. c, Interaction of the ‘wing’:
Lys245(NZ) contacts the phosphate backbone at G6(O2P).
d, Interaction of Lys219(NZ) from the loop in the HTH motif,
which contacts the phosphate backbone at C22(O3P) and
T23(O2P). This figure was generated with the graphics
program QUANTA (Molecular Simulations, Inc.).

FIG. 2 Protein-DNA contacts in the PU.1-DNA complex. a, Backbone of
the PU.1 ETS-domain-DNA complex. The DNA is bent by 8° from canonical
B-DNA structure and curved nearly uniformly along the entire 16 bp.
Analysis of the DNA structure demonstrated an average helical twist
of 33°, an average rise per base pair of 3.2 Å and 10.8 bp per turn. The
minor groove is slightly enlarged (~2-3 Å from the mean) in the GGAA
(bold) region at the midpoint of the oligonucleotide. In the Ets-1-DNA
complex (ref. 9), a 60° kink is induced between base pairs 6 and 7 by intercalation
of the side chain of Trp28. The equivalent of this tryptophan, Tyr175 in
PU.1, shown in the model, is located in the hydrophobic core, excluding the
possibility of intercalation with the DNA bases. Substitution of glycine for
this tyrosine did not affect DNA binding. Furthermore, the site of intercalation
in the Ets-1-DNA complex, base pairs 6 and 7, is located at the opposite
extreme of the DNA duplex, upstream of the GGAA core sequence. b,
Sequence of the oligonucleotide bound to the PU.1 protein (GGAA PU box in
bold lines). Residues that contact the DNA through main-chain atoms are
underlined. Well-defined solvent molecules located within 3.2 Å of protein
or DNA atoms are identified by an encircled W. Contacts from residues of the
‘wing’ are made with the nucleotides upstream of the GGAA sequence, and
residues from the loop in the HTH motif interact with the opposite strand,
downstream of the GGAA site. The direction of the DNA was confirmed by
the location of the three iodinated bases (13, 29, 31; black dots) used for
phase calculation. Seven of the residues that contact DNA are strictly
conserved and four others are highly conserved. c, Sequence alignment of
the PU.1 and Ets-1 ETS domains, representing extremes of evolutionary
divergence in the family. Residues strictly conserved in all Ets proteins are
shown in black boxes; dashes indicate gaps within the family. Numbering
and secondary structural features of the PU.1 domain are indicated. The
results of mutational analysis when glycine was substituted for a residue are
also shown. The effects of the interchanges are labelled + or − above the
sequence, indicating that DNA binding was retained or abolished.
Mutations were generated essentially as described.
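
The helical parameters quoted in the Fig. 2 legend are related by simple geometry: one full turn is 360°, so the number of base pairs per turn is 360° divided by the average twist per base-pair step. The short sketch below shows this relation as a consistency check on the reported figures. The function names are chosen for the example.

```python
def bp_per_turn(twist_deg):
    """Base pairs per helical turn for a given average twist (degrees)."""
    return 360.0 / twist_deg


def twist_from_bp_per_turn(n_bp):
    """Average twist (degrees per base-pair step) for n_bp per turn."""
    return 360.0 / n_bp


# Values quoted in the Fig. 2 legend: twist of 33 degrees, 10.8 bp per turn.
print(round(bp_per_turn(33.0), 1))             # bp per turn from the twist
print(round(twist_from_bp_per_turn(10.8), 1))  # twist from bp per turn
```

With the rounded twist of 33° the relation gives about 10.9 bp per turn, and the reported 10.8 bp per turn corresponds to a twist of about 33.3°, so the two quoted values agree to within rounding.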


The turn in the HTH motif is actually a loop, and because the
sequences in this loop as well as the loop (‘wing’) between strands
β3 and β4 are not strictly conserved among members of the Ets
family, these residues might be important sites for specific
recognition by individual members of the family. In fact, the lengths of
both of the contact loops differ among members of the family, with
the PU.1 loop containing an ‘extra’ glycine at residue 220 and
lacking a glycine after residue 247. Such conformational
differences are expected between family members, but the contrast
between the PU.1 and Ets-1 complexes was unexpected. The
striking distinction in the mode of DNA contact by the PU.1
and Ets-1 domains could reflect extreme evolutionary divergence
between members of the Ets family. Alternatively, it should be
noted that the Ets-1-DNA complex was formed under denaturing
conditions, and it is possible that the Trp intercalation occurred
early during the renaturation step with subsequent protein
refolding.

Future extensive mutational studies of amino acids that contact
DNA in Ets proteins are needed to identify residues that mediate
recognition of a specific DNA sequence by a given family member.
Ultimately, crystal structures of other Ets proteins complexed to
DNA must be compared to distinguish unique DNA contacts.